nn accelerator
Fast, Scalable, Energy-Efficient Non-element-wise Matrix Multiplication on FPGA
Zhu, Xuqi, Zhang, Huaizhi, Lee, JunKyu, Zhu, Jiacheng, Pal, Chandrajit, Saha, Sangeet, McDonald-Maier, Klaus D., Zhai, Xiaojun
Modern Neural Network (NN) architectures heavily rely on vast numbers of multiply-accumulate arithmetic operations, constituting the predominant computational cost. Therefore, this paper proposes a high-throughput, scalable and energy efficient non-element-wise matrix multiplication unit on FPGAs as a basic component of the NNs. We firstly streamline inter-layer and intra-layer redundancies of MADDNESS algorithm, a LUT-based approximate matrix multiplication, to design a fast, efficient scalable approximate matrix multiplication module termed "Approximate Multiplication Unit (AMU)". The AMU optimizes LUT-based matrix multiplications further through dedicated memory management and access design, decoupling computational overhead from input resolution and boosting FPGA-based NN accelerator efficiency significantly. The experimental results show that using our AMU achieves up to 9x higher throughput and 112x higher energy efficiency over the state-of-the-art solutions for the FPGA-based Quantised Neural Network (QNN) accelerators.
Ultra-Low Power Keyword Spotting at the Edge
Ulkar, Mehmet Gorkem, Okman, Osman Erman
Keyword spotting (KWS) has become an indispensable part of many intelligent devices surrounding us, as audio is one of the most efficient ways of interacting with these devices. The accuracy and performance of KWS solutions have been the main focus of the researchers, and thanks to deep learning, substantial progress has been made in this domain. However, as the use of KWS spreads into IoT devices, energy efficiency becomes a very critical requirement besides the performance. We believe KWS solutions that would seek power optimization both in the hardware and the neural network (NN) model architecture are advantageous over many solutions in the literature where mostly the architecture side of the problem is considered. In this work, we designed an optimized KWS CNN model by considering end-to-end energy efficiency for the deployment at MAX78000, an ultra-low-power CNN accelerator. With the combined hardware and model optimization approach, we achieve 96.3\% accuracy for 12 classes while only consuming 251 uJ per inference. We compare our results with other small-footprint neural network-based KWS solutions in the literature. Additionally, we share the energy consumption of our model in power-optimized ARM Cortex-M4F to depict the effectiveness of the chosen hardware for the sake of clarity.
Entropy-Based Modeling for Estimating Soft Errors Impact on Binarized Neural Network Inference
Khoshavi, Navid, Sargolzaei, Saman, Roohi, Arman, Broyles, Connor, Bi, Yu
Over past years, the easy accessibility to the large scale datasets has significantly shifted the paradigm for developing highly accurate prediction models that are driven from Neural Network (NN). These models can be potentially impacted by the radiation-induced transient faults that might lead to the gradual downgrade of the long-running expected NN inference accelerator. The crucial observation from our rigorous vulnerability assessment on the NN inference accelerator demonstrates that the weights and activation functions are unevenly susceptible to both single-event upset (SEU) and multi-bit upset (MBU), especially in the first five layers of our selected convolution neural network. In this paper, we present the relatively-accurate statistical models to delineate the impact of both undertaken SEU and MBU across layers and per each layer of the selected NN. These models can be used for evaluating the error-resiliency magnitude of NN topology before adopting them in the safety-critical applications.
AxTrain: Hardware-Oriented Neural Network Training for Approximate Inference
He, Xin, Ke, Liu, Lu, Wenyan, Yan, Guihai, Zhang, Xuan
The intrinsic error tolerance of neural network (NN) makes approximate computing a promising technique to improve the energy efficiency of NN inference. Conventional approximate computing focuses on balancing the efficiency-accuracy trade-off for existing pre-trained networks, which can lead to suboptimal solutions. In this paper, we propose AxTrain, a hardware-oriented training framework to facilitate approximate computing for NN inference. Specifically, AxTrain leverages the synergy between two orthogonal methods---one actively searches for a network parameters distribution with high error tolerance, and the other passively learns resilient weights by numerically incorporating the noise distributions of the approximate hardware in the forward pass during the training phase. Experimental results from various datasets with near-threshold computing and approximation multiplication strategies demonstrate AxTrain's ability to obtain resilient neural network parameters and system energy efficiency improvement.
[D] NN accelerator. Need resources/links/articles about relevance to society/industry • r/MachineLearning
Hello Everyone I am electronics and computer engineering student interested in machine learning. Together with other students we have to decide on semester project theme and present it to supervisor. I would like to work on analog neural network accelerator but I have to present how the "problem" which aforementioned device solves is relevant to industry or society. For example that artificial neural networks require substantial amount of power to operate and there is potential to reduce it. Or that there is raising demand for NN accelerators.